Extracting Common Motifs under the Levenshtein Measure: Theory and Experimentation

Authors

  • Ezekiel F. Adebiyi
  • Michael Kaufmann
Abstract

Using our techniques for extracting approximate non-tandem repeats [1] on well-constructed maximal models, we derive an algorithm to find common motifs of length P that occur in N sequences with at most D differences under the edit distance metric. We compare the effectiveness of our algorithm with the more involved algorithm of Sagot [17] for edit distance on some real sequences. Her method has previously been implemented only for Hamming distance [12, 20], not for edit distance. Our resulting method turns out to be simpler and more efficient, both theoretically and in practice, for moderately large P and D.
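To make the problem statement concrete, the following minimal Python sketch checks whether a given candidate motif occurs in every one of N sequences with at most D differences under unit-cost edit distance. It is not the authors' extraction algorithm, nor Sagot's: it only verifies one candidate by plain dynamic programming, and the function names (edit_distance, occurs_within, is_common_motif) are hypothetical, chosen for illustration.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def occurs_within(motif: str, seq: str, d: int) -> bool:
    """True if some substring of seq is within edit distance d of motif."""
    # Row 0 is all zeros, so an occurrence may start at any position of seq.
    prev = [0] * (len(seq) + 1)
    for i, cm in enumerate(motif, start=1):
        curr = [i]
        for j, cs in enumerate(seq, start=1):
            cost = 0 if cm == cs else 1
            curr.append(min(prev[j] + 1,
                            curr[j - 1] + 1,
                            prev[j - 1] + cost))
        prev = curr
    # The last row holds, for each end position in seq, the best distance
    # between motif and a substring of seq ending there.
    return min(prev) <= d


def is_common_motif(motif: str, seqs: List[str], d: int) -> bool:
    """True if motif occurs with at most d differences in every sequence."""
    return all(occurs_within(motif, s, d) for s in seqs)


if __name__ == "__main__":
    sequences = ["TTACGTAA", "GGACTTCC", "AGTCC"]  # made-up toy data
    print(is_common_motif("ACGT", sequences, 1))
```

Verifying one candidate this way costs time proportional to P times the total sequence length. Enumerating all candidate motifs of length P naively would be exponential in P, which is the cost that dedicated extraction algorithms aim to avoid.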

Similar resources

Finding Higher Order Motifs under the Levenshtein Measure

We study the problem of finding higher order motifs under the Levenshtein measure, otherwise known as the edit distance. In the problem set-up, we are given sequences, each of average length , over a finite alphabet and thresholds and ; we are to find composite motifs that contain motifs of length (these motifs occur with at most differences) in distinct sequences. Two interesting but involved a...

Extracting semantic clusters from the alignment of definitions

Through the alignment of definitions from two or more different sources, it is possible to retrieve pairs of words that can be used indistinguishably in the same sentence without changing the meaning of the concept. As lexicographic work exploits common defining schemes, such as genus and differentia, a concept is similarly defined by different dictionaries. The difference in words used betw...

Norwegian Dialects Examined Perceptually and Acoustically (Wilbert Heeringa)

Gooskens (2003) described an experiment which determined linguistic distances between 15 Norwegian dialects as perceived by Norwegian listeners. The results are compared to Levenshtein distances, calculated on the basis of transcriptions (of the words) of the same recordings as used in the perception experiment. The Levenshtein distance is equal to the sum of the weights of the insertions, dele...

A Fast Algorithm for the Inexact Characteristic String Problem

We present a new algorithm to solve the INEXACT CHARACTERISTIC STRING PROBLEM using Hamming distance instead of Levenshtein distance as a measure. We embed our new algorithm and the previously known algorithm for Levenshtein distance in a common framework which reveals an additional improvement to the Levenshtein distance algorithm. The INEXACT CHARACTERISTIC STRING PROBLEM can thus be solved i...

An Analytical Study on Calligraphic, Human and Vegetal Motifs in Some Examples of Enameled Glasses in Egypt and Syria (Mamluke Period) in Comparison with Iranian Metalwork (Ilkhanid and Timurid Periods)

Throughout history, artworks in the field of metalwork and glasswork reflect different themes. They are considered as important means of manifesting Islamic art and traditional crafts in different countries which have been producing a wide variety of art products. Meanwhile, the influence of some kinds of artworks from different lands and the counterinfluence of concepts and artistic themes amo...

Publication date: 2002